AITopics | systematic bias

Tight Sample Complexity Bounds for Best-Arm Identification Under Bounded Systematic Bias

Qian, Tianhao

arXiv.org Machine Learning, Apr-21-2026

As search depth increases in autonomous reasoning and embodied planning, the candidate action space expands exponentially, heavily taxing computational budgets. While heuristic pruning is a common countermeasure, it operates without formal safety guarantees when surrogate models (like LLMs) exhibit systematic evaluation biases. This paper frames the node expansion process as a localized Best-Arm Identification (BAI) problem over dynamic frontiers, subject to a bounded systematic bias $L$. By inverting the Lambert W function, we establish an additive sample complexity of $\mathcal{O}((\Delta-4L)^{-2})$, which indicates that safe node elimination is only feasible when the empirical reward gap exceeds $4L$. We complement this with an information-theoretic lower bound of $\Omega((\Delta-2L)^{-2})$ to confirm the structural limits of biased search. Subsequent evaluations on both synthetic trees and complex reasoning tasks demonstrate that adhering to this local safety boundary successfully preserves optimal trajectories while maximizing sample allocation efficiency.
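To make the two thresholds in this abstract concrete, the following minimal sketch evaluates the scaling terms $(\Delta-4L)^{-2}$ and $(\Delta-2L)^{-2}$ for a given gap and bias bound. The function names, and the convention of returning infinity when the gap does not clear the threshold, are illustrative assumptions rather than the paper's algorithm; constants and logarithmic factors are omitted.

```python
import math


def upper_bound_scale(gap: float, bias: float) -> float:
    """Scaling term (gap - 4*bias)^-2 quoted for the upper bound.

    Returns math.inf when the gap does not exceed 4*bias
    (an illustrative convention, not taken from the paper).
    """
    margin = gap - 4.0 * bias
    return math.inf if margin <= 0 else margin ** -2


def lower_bound_scale(gap: float, bias: float) -> float:
    """Scaling term (gap - 2*bias)^-2 quoted for the lower bound."""
    margin = gap - 2.0 * bias
    return math.inf if margin <= 0 else margin ** -2


def elimination_feasible(gap: float, bias: float) -> bool:
    """Safe elimination is stated to require a gap larger than 4*bias."""
    return gap > 4.0 * bias


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    for gap, bias in [(0.5, 0.05), (0.5, 0.12), (0.1, 0.05)]:
        print(gap, bias, elimination_feasible(gap, bias),
              upper_bound_scale(gap, bias), lower_bound_scale(gap, bias))
```

As the bias bound approaches a quarter of the gap, the upper-bound term grows without limit, which is one way to read the abstract's claim that elimination is only feasible above $4L$.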

artificial intelligence, large language model, natural language, (18 more...)

arXiv.org Machine Learning

2604.14345

Country: Asia > China (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)

e664650506f1cf2b4696df892147c06e-Paper-Conference.pdf

Neural Information Processing Systems, Feb-18-2026, 12:16:27 GMT

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > Ohio > Franklin County > Columbus (0.04)
Africa > South Sudan > Equatoria > Central Equatoria > Juba (0.04)
North America > United States > North Carolina > Durham County > Durham (0.04)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Banking & Finance > Credit (0.47)
Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
Information Technology > Data Science (0.67)

Automating Data Annotation under Strategic Human Agents: Risks and Potential Solutions

Neural Information Processing Systems, Oct-10-2025, 19:50:59 GMT

When high-quality human-annotated samples are hard to acquire, a common practice is to use the model itself to annotate unlabeled data samples.

acceptance rate, agent, dataset, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Ohio > Franklin County > Columbus (0.04)
Africa > South Sudan > Equatoria > Central Equatoria > Juba (0.04)
North America > United States > North Carolina > Durham County > Durham (0.04)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Banking & Finance > Credit (0.47)
Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)

The Token Tax: Systematic Bias in Multilingual Tokenization

Lundin, Jessica M., Zhang, Ada, Karim, Nihal, Louzan, Hamza, Wei, Victor, Adelani, David, Carroll, Cody

arXiv.org Artificial Intelligence, Sep-9-2025

Tokenization inefficiency imposes structural disadvantages on morphologically complex, low-resource languages, inflating compute costs and depressing accuracy. We evaluate 10 large language models (LLMs) on AfriMMLU (9,000 MCQA items; 5 subjects; 16 African languages) and show that fertility (tokens/word) reliably predicts accuracy. Higher fertility consistently predicts lower accuracy across all models and subjects. We further find that reasoning models (DeepSeek, o1) consistently outperform non-reasoning peers across high- and low-resource languages in the AfriMMLU dataset, narrowing accuracy gaps observed in prior generations. Finally, translating token inflation to economics, a doubling in tokens results in quadrupled training cost and time, underscoring the token tax faced by many languages. These results motivate morphologically aware tokenization, fair pricing, and multilingual benchmarks for equitable natural language processing (NLP).
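The fertility metric and the cost claim in this abstract can be sketched in a few lines. The toy tokenizer `char_bigram_tokenize` and the helper names are hypothetical stand-ins, not the tokenizers or code evaluated in the paper, and the quadratic exponent simply encodes the abstract's statement that doubling tokens quadruples training cost.

```python
from typing import Callable, List, Sequence


def char_bigram_tokenize(word: str) -> List[str]:
    """Hypothetical toy tokenizer: split a word into 2-character pieces."""
    return [word[i:i + 2] for i in range(0, len(word), 2)]


def fertility(words: Sequence[str],
              tokenize: Callable[[str], List[str]]) -> float:
    """Average number of tokens per word (tokens/word)."""
    if not words:
        raise ValueError("fertility is undefined for an empty word list")
    total_tokens = sum(len(tokenize(word)) for word in words)
    return total_tokens / len(words)


def relative_training_cost(token_ratio: float, exponent: float = 2.0) -> float:
    """Cost multiplier for a given token inflation ratio.

    exponent=2.0 is an assumption that mirrors the abstract's
    "doubling in tokens results in quadrupled training cost".
    """
    return token_ratio ** exponent


if __name__ == "__main__":
    sample = "a short illustrative sentence".split()
    print(fertility(sample, char_bigram_tokenize))
    print(relative_training_cost(2.0))
```

A real comparison would swap in each model's own tokenizer for `tokenize` and compute fertility per language over the same parallel text.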

accuracy, large language model, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2509.05486

Country:

North America > United States > New Mexico (0.14)
North America > Mexico > Mexico City (0.14)

Genre:

Research Report > New Finding (0.70)
Research Report > Experimental Study (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

The World As Large Language Models See It: Exploring the reliability of LLMs in representing geographical features

Abbasi, Omid Reza, Welscher, Franz, Weinberger, Georg, Scholz, Johannes

arXiv.org Artificial Intelligence, Jun-3-2025

As large language models (LLMs) continue to evolve, questions about their trustworthiness in delivering factual information have become increasingly important. This concern also applies to their ability to accurately represent the geographic world. With recent advancements in this field, it is relevant to consider whether and to what extent LLMs' representations of the geographical world can be trusted. This study evaluates the performance of GPT-4o and Gemini 2.0 Flash in three key geospatial tasks: geocoding, elevation estimation, and reverse geocoding. In the geocoding task, both models exhibited systematic and random errors in estimating the coordinates of St. Anne's Column in Innsbruck, Austria, with GPT-4o showing greater deviations and Gemini 2.0 Flash demonstrating more precision but a significant systematic offset. For elevation estimation, both models tended to underestimate elevations across Austria, though they captured overall topographical trends, and Gemini 2.0 Flash performed better in eastern regions. The reverse geocoding task, which involved identifying Austrian federal states from coordinates, revealed that Gemini 2.0 Flash outperformed GPT-4o in overall accuracy and F1-scores, demonstrating better consistency across regions. Despite these findings, neither model achieved an accurate reconstruction of Austria's federal states, highlighting persistent misclassifications. The study concludes that while LLMs can approximate geographic information, their accuracy and reliability are inconsistent, underscoring the need for fine-tuning with geographical information to enhance their utility in GIScience and Geoinformatics.
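The distinction this abstract draws between systematic and random geocoding error can be illustrated by splitting repeated coordinate estimates into a mean offset and a spread around that mean. The sketch below is an assumption-laden illustration, not the study's evaluation code: the function names are hypothetical, and degrees are converted to meters with a simple local flat-earth approximation.

```python
import math
from typing import List, Tuple

# Approximate length of one degree of latitude in meters (an approximation).
METERS_PER_DEGREE = 111_320.0


def to_local_meters(lat: float, lon: float,
                    ref_lat: float, ref_lon: float) -> Tuple[float, float]:
    """Return (east, north) offsets in meters from the reference point."""
    north = (lat - ref_lat) * METERS_PER_DEGREE
    east = (lon - ref_lon) * METERS_PER_DEGREE * math.cos(math.radians(ref_lat))
    return east, north


def error_components(predictions: List[Tuple[float, float]],
                     ref_lat: float, ref_lon: float) -> Tuple[float, float]:
    """Split repeated (lat, lon) estimates into two error measures.

    systematic: distance from the reference to the mean prediction.
    random_rms: root-mean-square distance of predictions from their mean.
    """
    offsets = [to_local_meters(lat, lon, ref_lat, ref_lon)
               for lat, lon in predictions]
    count = len(offsets)
    mean_east = sum(east for east, _ in offsets) / count
    mean_north = sum(north for _, north in offsets) / count
    systematic = math.hypot(mean_east, mean_north)
    random_rms = math.sqrt(
        sum((east - mean_east) ** 2 + (north - mean_north) ** 2
            for east, north in offsets) / count
    )
    return systematic, random_rms
```

Under this decomposition, a model that is precise but offset shows a small `random_rms` with a large `systematic` value, while a model with scattered estimates shows the reverse.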

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2506.00203

Country: Europe > Austria > Tyrol > Innsbruck (0.25)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Using large language models to produce literature reviews: Usages and systematic biases of microphysics parametrizations in 2699 publications

Zhang, Tianhang, Fu, Shengnan, Schultz, David M., Zheng, Zhonghua

arXiv.org Artificial Intelligence, Mar-27-2025

Large language models afford opportunities for using computers for intensive tasks, realizing research opportunities that have not been considered before. One such opportunity could be a systematic interrogation of the scientific literature. Here, we show how a large language model can be used to construct a literature review of 2699 publications associated with microphysics parametrizations in the Weather Research and Forecasting (WRF) model, with the goal of learning how they were used, and what their systematic biases were, when simulating precipitation. The database was constructed from publications identified in Web of Science and Scopus searches. The large language model GPT-4 Turbo was used to extract information about model configurations and performance from the text of the 2699 publications. Our results reveal the landscape of how nine of the most popular microphysics parameterizations have been used around the world: Lin, Ferrier, WRF Single-Moment, Goddard Cumulus Ensemble, Morrison, Thompson, and WRF Double-Moment. More studies used one-moment parameterizations before 2020 and two-moment parameterizations after 2020. Seven out of nine parameterizations tended to overestimate precipitation. However, systematic biases of parameterizations differed in various regions. Except for simulations using the Lin, Ferrier, and Goddard parameterizations, which tended to underestimate precipitation over almost all locations, the remaining six parameterizations tended to overestimate, particularly over China, southeast Asia, western United States, and central Africa. This method could be used by other researchers to help understand how the increasingly massive body of scientific literature can be harnessed through the power of artificial intelligence to solve their research problems.

large language model, machine learning, parameterization, (20 more...)

arXiv.org Artificial Intelligence

2503.21352

Country:

Asia > Southeast Asia (0.24)
Africa > Central Africa (0.24)
Asia > South Korea (0.14)
(20 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

DISCERN: Decoding Systematic Errors in Natural Language for Text Classifiers

Menon, Rakesh R., Srivastava, Shashank

arXiv.org Artificial Intelligence, Oct-29-2024

Despite their high predictive accuracies, current machine learning systems often exhibit systematic biases stemming from annotation artifacts or insufficient support for certain classes in the dataset. Recent work proposes automatic methods for identifying and explaining systematic biases using keywords. We introduce DISCERN, a framework for interpreting systematic biases in text classifiers using language explanations. DISCERN iteratively generates precise natural language descriptions of systematic errors by employing an interactive loop between two large language models. Finally, we use the descriptions to improve classifiers by augmenting classifier training sets with synthetically generated instances or annotated examples via active learning. On three text-classification datasets, we demonstrate that language explanations from our framework induce consistent performance improvements that go beyond what is achievable with exemplars of systematic bias. Finally, in human evaluations, we show that users can interpret systematic biases more effectively (by over 25% relative) and efficiently when described through language explanations as opposed to cluster exemplars.

classifier, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2410.22239

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
Asia > Singapore (0.04)
North America > United States > Pennsylvania (0.04)
(6 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Law (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

A Systematic Bias of Machine Learning Regression Models and Its Correction: an Application to Imaging-based Brain Age Prediction

Lee, Hwiyoung, Chen, Shuo

arXiv.org Machine Learning, May-24-2024

Machine learning models for continuous outcomes often yield systematically biased predictions, particularly for values that largely deviate from the mean. Specifically, predictions for large-valued outcomes tend to be negatively biased, while those for small-valued outcomes are positively biased. We refer to this linear central tendency warped bias as the "systematic bias of machine learning regression". In this paper, we first demonstrate that this issue persists across various machine learning models, and then delve into its theoretical underpinnings. We propose a general constrained optimization approach designed to correct this bias and develop a computationally efficient algorithm to implement our method. Our simulation results indicate that our correction method effectively eliminates the bias from the predicted outcomes. We apply the proposed approach to the prediction of brain age using neuroimaging data. In comparison to competing machine learning models, our method effectively addresses the longstanding issue of "systematic bias of machine learning regression" in neuroimaging-based brain age calculation, yielding unbiased predictions of brain age.
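The bias pattern described in this abstract (large outcomes under-predicted, small outcomes over-predicted) can be reproduced with ordinary least squares on synthetic data. The sketch below is only a demonstration of the pattern under assumed settings (one Gaussian feature, Gaussian noise, hypothetical function names); it is not the constrained optimization correction the paper proposes.

```python
import random
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def prediction_error_slope(n: int = 2000, noise: float = 1.0,
                           seed: int = 0) -> float:
    """Slope of (prediction - outcome) regressed on the true outcome.

    A negative value means large outcomes are under-predicted and
    small outcomes are over-predicted.
    """
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [x + rng.gauss(0.0, noise) for x in xs]
    slope, intercept = fit_line(xs, ys)
    errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
    error_slope, _ = fit_line(ys, errors)
    return error_slope


if __name__ == "__main__":
    print(prediction_error_slope())
```

For in-sample least squares this slope equals minus the unexplained fraction of outcome variance, so it is negative whenever the fit is imperfect, which matches the direction of bias the abstract describes.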

imaging-based brain age prediction, machine learning regression model, systematic bias, (2 more...)

arXiv.org Machine Learning

2405.1595

Genre: Research Report (0.69)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.73)
Health & Medicine > Health Care Technology (0.73)
Health & Medicine > Diagnostic Medicine > Imaging (0.73)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.40)

Automating Data Annotation under Strategic Human Agents: Risks and Potential Solutions

Xie, Tian, Zhang, Xueru

arXiv.org Artificial Intelligence, May-12-2024

As machine learning (ML) models are increasingly used in social domains to make consequential decisions about humans, they often have the power to reshape data distributions. Humans, as strategic agents, continuously adapt their behaviors in response to the learning system. As populations change dynamically, ML systems may need frequent updates to ensure high performance. However, acquiring high-quality human-annotated samples can be highly challenging and even infeasible in social domains. A common practice to address this issue is using the model itself to annotate unlabeled data samples. This paper investigates the long-term impacts when ML models are retrained with model-annotated samples when they incorporate human strategic responses. We first formalize the interactions between strategic agents and the model and then analyze how they evolve under such dynamic interactions. We find that agents are increasingly likely to receive positive decisions as the model gets retrained, whereas the proportion of agents with positive labels may decrease over time. We thus propose a refined retraining process to stabilize the dynamics. Last, we examine how algorithmic fairness can be affected by these retraining processes and find that enforcing common fairness constraints at every round may not benefit the disadvantaged group in the long run. Experiments on (semi-)synthetic and real data validate the theoretical findings.

acceptance rate, agent, dataset, (16 more...)

arXiv.org Artificial Intelligence

2405.08027

Country:

North America > United States > Ohio > Franklin County > Columbus (0.04)
Africa > South Sudan > Equatoria > Central Equatoria > Juba (0.04)
North America > United States > North Carolina > Durham County > Durham (0.04)
(2 more...)

Genre: Research Report > New Finding (0.92)

Industry:

Banking & Finance (0.68)
Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)

From Spectra to Biophysical Insights: End-to-End Learning with a Biased Radiative Transfer Model

She, Yihang, Atzberger, Clement, Blake, Andrew, Keshav, Srinivasan

arXiv.org Artificial Intelligence, Mar-5-2024

Advances in machine learning have boosted the use of Earth observation data for climate change research. Yet, the interpretability of machine-learned representations remains a challenge, particularly in understanding forests' biophysical reactions to climate change. Traditional methods in remote sensing that invert radiative transfer models (RTMs) to retrieve biophysical variables from spectral data often fail to account for biases inherent in the RTM, especially for complex forests. We propose to integrate RTMs into an auto-encoder architecture, creating an end-to-end learning approach. Our method not only corrects biases in RTMs but also outperforms traditional techniques for variable retrieval like neural network regression. Furthermore, our framework has general potential for inverting biased physical models. The code is available at https://github.com/yihshe/ai-refined-rtm.git.

ae rtm corr, biophysical variable, physical model, (13 more...)

arXiv.org Artificial Intelligence

2403.02922

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry: Energy (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)
